Mosquito & Land Cover Stats
This lesson shows how to investigate the GLOBE data, calculate statistics, and create charts & maps.
import pandas as pd
pd.set_option("display.max_columns", None)
import geopandas as gpd
import matplotlib.pyplot as plt
import seaborn as sns
import folium
Mosquito
Let’s load the data directly from the link (no need to download anything to your computer).
mosquito = gpd.read_file('https://github.com/geo-di-lab/emerge-lessons/raw/refs/heads/main/docs/data/globe_mosquito.zip')
mosquito.head()
| | CountryCode | CountryName | Elevation | AbdomenCloseupPhotoUrls | BreedingGroundEliminated | Comments | DataSource | ExtraData | Genus | GlobeTeams | LarvaFullBodyPhotoUrls | LarvaeCount | LastIdentifyStage | LocationAccuracyM | LocationMethod | MeasuredAt | MeasurementElevation | MeasurementLatitude | MeasurementLongitude | MosquitoAdults | MosquitoEggs | MosquitoHabitatMapperId | MosquitoPupae | Species | Userid | WaterSource | WaterSourcePhotoUrls | WaterSourceType | OrganizationId | OrganizationName | Protocol | SiteId | SiteName | MeasuredDate | LarvaeCountProcessed | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BRA | Brazil | 6.3 | None | false | None | GLOBE Observer App | LarvaeVisibleNo | None | [COLUNSLZ] | None | 0 | None | 13 | automatic | 2024-12-31 17:16:00 | 0 | -2.5617 | -44.2657 | None | None | 46287 | false | None | 137422629 | ovitrap | https://data.globe.gov/system/photos/2024/12/3... | container: artificial | 17459532 | Brazil Citizen Science | mosquito_habitat_mapper | 371514 | 23MNT816168 | 2024-12-31 | 0.0 | POINT (-44.26597 -2.56197) |
| 1 | BRA | Brazil | 6.3 | None | false | None | GLOBE Observer App | LarvaeVisibleNo | None | [COLUNSLZ] | None | 0 | None | 13 | automatic | 2024-12-31 17:20:00 | 0 | -2.5617 | -44.2657 | None | None | 46290 | false | None | 137422629 | ovitrap | https://data.globe.gov/system/photos/2024/12/3... | container: artificial | 17459532 | Brazil Citizen Science | mosquito_habitat_mapper | 371514 | 23MNT816168 | 2024-12-31 | 0.0 | POINT (-44.26597 -2.56197) |
| 2 | BRA | Brazil | 7.4 | None | true | None | GLOBE Observer App | LarvaeVisibleNo | None | [COLUNSLZ] | None | 0 | None | 51 | automatic | 2024-12-31 22:32:00 | 0 | -2.5163 | -44.3023 | None | None | 46482 | false | None | 137420190 | cement, metal or plastic tank | None | container: artificial | 17459532 | Brazil Citizen Science | mosquito_habitat_mapper | 372864 | 23MNT775218 | 2024-12-31 | 0.0 | POINT (-44.30288 -2.51676) |
| 3 | BRA | Brazil | 20.6 | None | true | None | GLOBE Observer App | LarvaeVisibleNo | None | [COLUNSLZ] | None | 0 | None | 66 | automatic | 2024-12-31 00:05:00 | 0 | -2.8639 | -44.0549 | None | None | 46203 | false | None | 137419937 | can or bottle | None | container: artificial | 17459532 | Brazil Citizen Science | mosquito_habitat_mapper | 373085 | 23MPS050834 | 2024-12-31 | 0.0 | POINT (-44.05526 -2.86396) |
| 4 | BRA | Brazil | 20.6 | None | true | None | GLOBE Observer App | LarvaeVisibleNo | None | [COLUNSLZ] | None | 0 | None | 28 | automatic | 2024-12-31 00:23:00 | 0 | -2.8639 | -44.0550 | None | None | 46223 | false | None | 137419937 | lake | None | still: lake/pond/swamp | 17459532 | Brazil Citizen Science | mosquito_habitat_mapper | 373085 | 23MPS050834 | 2024-12-31 | 0.0 | POINT (-44.05526 -2.86396) |
See the list of columns:
mosquito.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 43009 entries, 0 to 43008
Data columns (total 36 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CountryCode 42925 non-null object
1 CountryName 42925 non-null object
2 Elevation 43009 non-null object
3 AbdomenCloseupPhotoUrls 874 non-null object
4 BreedingGroundEliminated 42948 non-null object
5 Comments 4001 non-null object
6 DataSource 43009 non-null object
7 ExtraData 12445 non-null object
8 Genus 4364 non-null object
9 GlobeTeams 16259 non-null object
10 LarvaFullBodyPhotoUrls 8632 non-null object
11 LarvaeCount 24689 non-null object
12 LastIdentifyStage 29908 non-null object
13 LocationAccuracyM 13811 non-null object
14 LocationMethod 18202 non-null object
15 MeasuredAt 43009 non-null datetime64[ms]
16 MeasurementElevation 42992 non-null object
17 MeasurementLatitude 42992 non-null float64
18 MeasurementLongitude 42992 non-null float64
19 MosquitoAdults 16850 non-null object
20 MosquitoEggs 16857 non-null object
21 MosquitoHabitatMapperId 43009 non-null object
22 MosquitoPupae 41207 non-null object
23 Species 1155 non-null object
24 Userid 43009 non-null object
25 WaterSource 43009 non-null object
26 WaterSourcePhotoUrls 34302 non-null object
27 WaterSourceType 43009 non-null object
28 OrganizationId 42925 non-null object
29 OrganizationName 42925 non-null object
30 Protocol 43009 non-null object
31 SiteId 43009 non-null object
32 SiteName 43009 non-null object
33 MeasuredDate 43009 non-null object
34 LarvaeCountProcessed 24686 non-null float64
35 geometry 43009 non-null geometry
dtypes: datetime64[ms](1), float64(3), geometry(1), object(31)
memory usage: 11.8+ MB
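Several numeric-looking columns, such as Elevation and LarvaeCount, are listed with the generic object dtype, which usually means the values are stored as text and cannot be averaged directly. Here is a minimal sketch of converting such a column with `pd.to_numeric`. It uses a small made-up table rather than the GLOBE data, and the `ElevationNum` column name is only illustrative.

```python
import pandas as pd

# Toy data (not from GLOBE): numbers stored as text, plus one unparseable entry
toy = pd.DataFrame({"Elevation": ["6.3", "7.4", "20.6", "unknown"]})

# errors="coerce" turns anything that cannot be parsed into NaN
toy["ElevationNum"] = pd.to_numeric(toy["Elevation"], errors="coerce")

print(toy.dtypes)
print(toy["ElevationNum"].mean())  # NaN values are skipped by mean()
```

Keeping the converted values in a new column leaves the original text available for checking which entries could not be parsed.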
How many rows are in the dataset?
len(mosquito)
43009
There were 43,009 citizen science contributions from 2018 to 2024. Now, let’s see the number of countries where people submitted data.
len(mosquito['CountryCode'].unique())
95
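Note that `unique()` lists a missing value as an entry of its own. CountryCode has missing entries (42,925 non-null values out of 43,009 rows), so this count can be one higher than the number of actual country codes. That is likely why the grouped result later in this lesson lists 94 codes. Here is a small sketch with made-up data showing the difference between `unique()` and `nunique()`, which skips missing values by default:

```python
import pandas as pd

# Toy data (not from GLOBE): three distinct codes and one missing value
codes = pd.Series(["BRA", "USA", "BRA", None, "ARG"])

n_with_missing = len(codes.unique())  # the missing value counts as an entry
n_codes = codes.nunique()             # missing values are excluded

print(n_with_missing, n_codes)
```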
Let’s see the types of habitats (water sources) that citizen scientists recorded.
# Broader water source types
mosquito['WaterSourceType'].value_counts()
WaterSourceType
container: artificial 33166
still: lake/pond/swamp 6275
container: natural 2202
flowing: still water found next to river or stream 1366
Name: count, dtype: int64
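To compare these categories as shares of all observations rather than raw counts, `value_counts` accepts `normalize=True`. Here is a minimal sketch on a made-up column (the values are illustrative, not the GLOBE counts):

```python
import pandas as pd

# Toy data (not from GLOBE): three artificial containers and one pond
toy = pd.Series(
    ["container: artificial"] * 3 + ["still: lake/pond/swamp"],
    name="WaterSourceType",
)

# normalize=True returns fractions; multiply by 100 for percentages
shares = toy.value_counts(normalize=True).mul(100).round(1)
print(shares)
```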
These are the general types of water sources that citizen scientists reported to NASA. Most observations (33,166 of 43,009, about 77%) were of artificial containers. Let’s see the more specific types in the WaterSource column:
# More specific water source types
mosquito['WaterSource'].value_counts()
WaterSource
cement, metal or plastic tank 7528
dish or pot 4102
well or cistern 2790
jar 2399
fountain or bird bath 2350
ovitrap 2243
adult mosquito trap 2073
pond 2029
other 1915
can or bottle 1888
ditch 1886
tire 1884
animal trough or water bowl 1177
puddle or still water next to a creek, stream or river 993
flower or plant pot/tray 932
trash container 882
plant clumps (bamboo etc) 768
puddle, vehhicle or animal tracks 644
tree holes 571
public works - culvert, bridge, road 570
discarded: other 511
puddle, vehicle or animal tracks 474
swamp or wetland 467
plant husk (areca, coconut etc) 453
lake 414
rain gutter or other architectural feature 277
estuary 148
pool 128
old car or boat 123
reservoir 112
grill or outdoor appliance 108
refrigerator drainage 69
animal shell (tortoise, mollusk etc) 65
bay or ocean 36
Name: count, dtype: int64
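The list contains both "puddle, vehhicle or animal tracks" and "puddle, vehicle or animal tracks", which look like the same habitat recorded under two spellings. Assuming they are meant to be one category, here is a sketch of merging them with `replace` before counting. It uses toy data, and the `fixes` mapping is an assumption to adapt after checking the source data.

```python
import pandas as pd

# Toy data (not from GLOBE) containing both spellings
toy = pd.Series([
    "puddle, vehhicle or animal tracks",
    "puddle, vehicle or animal tracks",
    "tire",
    "puddle, vehhicle or animal tracks",
], name="WaterSource")

# Map the misspelled label to the corrected one (assumed to be the same category)
fixes = {"puddle, vehhicle or animal tracks": "puddle, vehicle or animal tracks"}
cleaned = toy.replace(fixes)

print(cleaned.value_counts())
```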
Let’s make a pie chart using the broader column, WaterSourceType.
# Here are some options for color palettes
display(sns.color_palette(palette='Set2'))
display(sns.color_palette(palette='twilight_shifted'))
display(sns.color_palette(palette='tab20'))
# Pie chart of water types
types = mosquito[['SiteId', 'WaterSourceType']].groupby('WaterSourceType', as_index=False).count()
plt.figure(figsize=(5, 5))
patches, texts = plt.pie(x = types['SiteId'],
                         colors = sns.color_palette('Set2'))
plt.title("GLOBE Mosquito Sightings: Water Source Types (General)")
plt.legend(patches, types['WaterSourceType'],
           loc = 'center left', bbox_to_anchor=(1, 0.5), frameon=False)
plt.show()
What is the average larvae count by country?
mosquito_avg = mosquito.groupby('CountryCode')['LarvaeCountProcessed'].mean()
mosquito_avg
CountryCode
ARE 5.000000
ARG 116.108108
AUS 2.500000
BEL NaN
BEN 38.598198
...
UKR NaN
URY 0.043478
USA 667.935961
VNM 22.686567
ZAF 17.800000
Name: LarvaeCountProcessed, Length: 94, dtype: float64
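A mean can be pulled far upward by a few very large reports, which may be part of why the USA average is so much higher than the others. Countries with no processed counts show NaN. Here is a sketch, on made-up numbers, of reporting the median and the number of non-missing values next to the mean so that each average can be judged in context:

```python
import numpy as np
import pandas as pd

# Toy data (not from GLOBE): AAA has one very large report, CCC has no counts
toy = pd.DataFrame({
    "CountryCode": ["AAA", "AAA", "AAA", "BBB", "BBB", "CCC"],
    "LarvaeCountProcessed": [2.0, 4.0, 600.0, 10.0, 20.0, np.nan],
})

# mean, median, and number of non-missing values per country
summary = (
    toy.groupby("CountryCode")["LarvaeCountProcessed"]
       .agg(["mean", "median", "count"])
)
print(summary)
```

Here the mean for AAA is dominated by the single large value while the median is not, and a `count` of 0 explains the NaN for CCC.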
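Let’s make a map showing the average larvae count by country. The country boundaries are from the World Food Programme. The color scale is capped at 50 (vmax = 50), so countries with higher averages share the top color, and countries without data are drawn in light grey.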
countries = gpd.read_file('https://github.com/geo-di-lab/emerge-lessons/raw/refs/heads/main/docs/data/world_countries_general.geojson').to_crs(epsg=4326)
mosquito_avg = countries.merge(mosquito_avg, left_on='iso3', right_on='CountryCode', how='left')
fig, ax = plt.subplots(figsize = (10, 4))
mosquito_avg.plot(column = 'LarvaeCountProcessed', cmap = 'viridis',
                  legend = True, vmin = 0, vmax = 50, ax = ax,
                  missing_kwds = {'color': 'lightgrey'})
plt.title('GLOBE Mosquito Sightings: Average Larvae Count')
ax.axis('off')
plt.show()
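The left merge keeps every country polygon and leaves the average empty where no code matches, but any GLOBE code that is missing from the boundary file is dropped without warning. Here is a sketch of checking for such codes with an outer merge and `indicator=True`. It uses toy tables with made-up codes, and the column names iso3 and CountryCode follow the lesson.

```python
import pandas as pd

# Toy tables (made-up codes): XXX has data but no matching polygon
countries_toy = pd.DataFrame({"iso3": ["AAA", "BBB", "CCC"]})
avg_toy = pd.DataFrame({
    "CountryCode": ["AAA", "BBB", "XXX"],
    "LarvaeCountProcessed": [1.0, 2.0, 3.0],
})

# indicator=True adds a _merge column: both, left_only, or right_only
check = countries_toy.merge(
    avg_toy, left_on="iso3", right_on="CountryCode",
    how="outer", indicator=True,
)

# Codes with data that would be lost in the left merge
unmatched = check.loc[check["_merge"] == "right_only", "CountryCode"].tolist()
print(unmatched)
```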
Now, we’ll make an interactive map showing total GLOBE observations by country.
mosquito_obs = mosquito.groupby('CountryCode').size() \
                       .reset_index(name='GLOBE_Observations')
mosquito_obs = countries.merge(mosquito_obs, left_on='iso3', right_on='CountryCode', how='left')
map = folium.Map(location=[0, 0], zoom_start=3, tiles="CartoDB positron")
# Create the map with a color scale for the number of observations submitted to GLOBE
folium.Choropleth(
    geo_data=mosquito_obs.to_json(),
    name="Choropleth",
    data=mosquito_obs,
    columns=['name', 'GLOBE_Observations'],
    key_on="feature.properties.name",
    fill_color="YlGnBu",
    fill_opacity=0.7,
    bins=[1, 50, 100, 500, 1000, 5000, 10000, 20000],
    legend_name="Number of GLOBE Observations (2018-2024)",
).add_to(map)
# Add a tooltip that appears when you hover over a country
# (GeoJson takes the layer as `data`; geo_data and key_on are Choropleth arguments)
folium.GeoJson(
    data=mosquito_obs,
    tooltip=folium.features.GeoJsonTooltip(fields=['name', 'GLOBE_Observations'], aliases=['Country:', 'Observations:']),
    style_function=lambda feature: {'color': 'white', 'weight': 1}
).add_to(map)
display(map)
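The `bins` list passed to the choropleth needs to span the observed counts. As far as we know, folium expects every non-missing value to fall between the first and last bin edge. Here is a sketch of checking this with pandas before drawing the map. It uses made-up counts, and the `pd.cut` tally is only an approximation, since edge handling may differ from folium's.

```python
import pandas as pd

bins = [1, 50, 100, 500, 1000, 5000, 10000, 20000]

# Made-up counts per country; None stands for a country with no observations
obs = pd.Series([3, 75, 1200, 18000, None], name="GLOBE_Observations")
values = obs.dropna()

# True when every count lies between the first and last bin edge (inclusive)
in_range = bool(values.between(bins[0], bins[-1]).all())
print(in_range)

# Approximate number of countries per color class
per_bin = pd.cut(values, bins=bins, include_lowest=True).value_counts(sort=False)
print(per_bin)
```

If the check returns False, widening the first or last edge of `bins` is one way to bring all values back into range.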